spontaneous behavior
Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models
Li, Weiqin, Yang, Peiji, Zhong, Yicheng, Zhou, Yixuan, Wang, Zhisheng, Wu, Zhiyong, Wu, Xixin, Meng, Helen
Spontaneous style speech synthesis, which aims to generate human-like speech, often encounters challenges due to the scarcity of high-quality data and limitations in model capabilities. Recent language model-based TTS systems can be trained on large, diverse, and low-quality speech datasets, resulting in highly natural synthesized speech. However, they are limited by the difficulty of simulating various spontaneous behaviors and capturing prosody variations in spontaneous speech. In this paper, we propose a novel spontaneous speech synthesis system based on language models. We systematically categorize and uniformly model diverse spontaneous behaviors. Moreover, fine-grained prosody modeling is introduced to enhance the model's ability to capture subtle prosody variations in spontaneous speech. Experimental results show that our proposed method significantly outperforms the baseline methods in terms of prosody naturalness and spontaneous behavior naturalness.
Eight Reasons to Prioritize Brain-Computer Interface Cybersecurity
Brain-computer interfaces (BCIs) are bidirectional systems that interact with the brain, allowing neuronal stimulation as well as the acquisition of neural data. BCIs can be classified according to their level of invasiveness, and invasive interfaces are extensively used in medical therapy. For example, invasive BCIs focused on neural recording have been used to control prosthetic limbs in impaired patients, while BCIs for neuromodulation have been helpful for treating neurodegenerative conditions such as Parkinson's disease [9]. The second main family of BCIs, in terms of invasiveness, is the non-invasive one. Non-invasive BCIs, mainly those focused on neural data acquisition such as electroencephalography (EEG), have gained popularity in recent years, extending their usage from traditional medical scenarios to new domains, such as entertainment or video games. However, despite the benefits of non-invasive BCIs, some works in the literature have identified cybersecurity issues from a neural data acquisition perspective. Martinovic et al. [19] demonstrated that an attacker could obtain sensitive personal data from BCI users by taking advantage of their cerebral responses (P300 potentials) when presented with known visual stimuli. Bonaci et al. [1] also described a scenario where attackers could maliciously add or modify software modules, causing the BCI to take dangerous actions against users. Finally, Takabi et al. [24] highlighted that most APIs used to develop BCI applications offered complete access to the information acquired by the BCI, presenting confidentiality problems. Cybersecurity of invasive BCIs is also a challenge that has been identified in the literature and whose application is in its initial stages [3,4,8]. This situation is complicated by the recent introduction of novel BCI designs based on nanotechnology aiming to surpass the limitations of traditional BCIs.